Toward Order-of-Magnitude Cascade Prediction
When a piece of information (microblog, photograph, video, link, etc.) starts
to spread in a social network, an important question arises: will it spread to
"viral" proportions, where "viral" is defined as an order-of-magnitude
increase in size? However, several previous studies have established that cascade
size and frequency are related through a power law, which leads to a severe
imbalance in this classification problem. In this paper, we devise a suite of
measurements based on "structural diversity" -- the variety of social contexts
(communities) in which individuals partaking in a given cascade engage. We
demonstrate these measures are able to distinguish viral from non-viral
cascades, despite the severe imbalance of the data for this problem. Further,
we leverage these measurements as features in a classification approach,
successfully predicting microblogs that grow from 50 to 500 reposts with
precision of 0.69 and recall of 0.52 for the viral class, despite this class
comprising under 2% of samples. This significantly outperforms our baseline
approach as well as the current state of the art. Our work also demonstrates
how we can trade off between precision and recall.
Comment: 4 pages, 15 figures, ASONAM 2015 poster paper
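
The precision/recall trade-off mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's classifier or data: the scores, labels, and the `precision_recall` helper below are hypothetical, chosen only so that the viral class makes up under 2% of samples (3 of 200). Raising the decision threshold favors precision, while lowering it favors recall.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall for the positive (viral) class at a score threshold."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)  # viral, flagged
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)  # non-viral, flagged
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)   # viral, missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data (an assumption, not from the paper):
# 200 cascades, of which 3 are viral (1.5% of samples).
scores = [0.9, 0.6, 0.3] + [0.8, 0.7] + [0.1] * 195
labels = [1, 1, 1] + [0, 0] + [0] * 195

# Sweep from a strict threshold to a lenient one.
for threshold in (0.85, 0.5, 0.2):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data the strictest threshold flags only the highest-scoring cascade, so every flagged cascade is viral but two viral cascades are missed. The most lenient threshold recovers all three viral cascades at the cost of two false positives.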